A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal Difference Learning

Authors

David Choi

Benjamin Van Roy

Abstract

The traditional Kalman filter can be viewed as a recursive stochastic algorithm that approximates an unknown function via a linear combination of prespecified basis functions given a sequence of noisy samples. In this paper, we generalize the algorithm to one that approximates the fixed point of an operator that is known to be a Euclidean norm contraction. Instead of noisy samples of the desired fixed point, the algorithm updates parameters based on noisy samples of functions generated by application of the operator, in the spirit of Robbins–Monro stochastic approximation. The algorithm is motivated by temporal-difference learning, and our developments lead to a possibly more efficient variant of temporal-difference learning. We establish convergence of the algorithm and explore efficiency gains through computational experiments involving optimal stopping and queueing problems.
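The two sampling schemes contrasted in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It uses the standard recursive least-squares form of a Kalman filter for a static weight vector, and everything named here (`features`, `g`, `ALPHA`, `recursive_fit`, the toy operator F(J) = g + ALPHA * J) is a hypothetical choice made only for illustration. The first fit sees noisy samples of the target itself. The second sees noisy samples of the operator applied to the current approximation.

```python
import numpy as np

ALPHA = 0.5  # contraction factor of the toy operator (assumption)


def features(x):
    """Hypothetical basis functions: 1, x, x^2."""
    return np.array([1.0, x, x * x])


def g(x):
    """Toy function chosen to lie in the span of the basis."""
    return 1.0 + 0.5 * x - x * x


def direct_sample(x, current_estimate):
    """Classical setting: a sample of the unknown function itself."""
    return g(x)


def operator_sample(x, current_estimate):
    """Fixed-point setting: a sample of (F J)(x) with F(J) = g + ALPHA * J,
    evaluated at the current approximation J = features . weights."""
    return g(x) + ALPHA * current_estimate


def recursive_fit(sample, n_steps=20000, noise=0.1, seed=0):
    """Recursive least-squares update of the basis weights r."""
    rng = np.random.default_rng(seed)
    r = np.zeros(3)
    P = 1e3 * np.eye(3)  # large initial covariance, i.e. a weak prior
    for _ in range(n_steps):
        x = rng.uniform(-1.0, 1.0)
        phi = features(x)
        y = sample(x, phi @ r) + noise * rng.standard_normal()
        gain = P @ phi / (1.0 + phi @ P @ phi)
        r = r + gain * (y - phi @ r)      # correct weights by the residual
        P = P - np.outer(gain, phi @ P)   # shrink the covariance
    return r


r_direct = recursive_fit(direct_sample)
r_fixed = recursive_fit(operator_sample)
print(r_direct, r_fixed)
```

Under these assumptions the first fit should approach the coefficients of `g`, and the second should approach those of the fixed point of the toy operator, g / (1 - ALPHA). The toy operator is a contraction in any norm, which is a stronger property than the Euclidean norm contraction the abstract requires.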


Similar resources

Fixed-point FPGA Implementation of a Kalman Filter for Range and Velocity Estimation of Moving Targets

Tracking filters are extensively used within object tracking systems in order to provide consecutive smooth estimations of position and velocity of the object with minimum error. Namely, Kalman filter and its numerous variants are widely known as simple yet effective linear tracking filters in many diverse applications. In this paper, an effective method is proposed for designing and implementa...


New three-step iteration process and fixed point approximation in Banach spaces

In this paper we propose a new iteration process, called the $K^{\ast}$ iteration process, for approximation of fixed points. We show that our iteration process is faster than the existing well-known iteration processes using numerical examples. Stability of the $K^{\ast}$ iteration process is also discussed. Finally we prove some weak and strong convergence theorems for Suzuki ge...


Bayesian Reward Filtering

A wide variety of function approximation schemes have been applied to reinforcement learning. However, Bayesian filtering approaches, which have been shown efficient in other fields such as neural network training, have been little studied. We propose a general Bayesian filtering framework for reinforcement learning, as well as a specific implementation based on sigma point Kalman filtering and...


Approximate Kalman Filter Q-Learning for Continuous State-Space MDPs

We seek to learn an effective policy for a Markov Decision Process (MDP) with continuous states via Q-Learning. Given a set of basis functions over state action pairs we search for a corresponding set of linear weights that minimizes the mean Bellman residual. Our algorithm uses a Kalman filter model to estimate those weights and we have developed a simpler approximate Kalman filter model that ...


Approximation of a generalized Euler-Lagrange type additive mapping on Lie $C^{\ast}$-algebras

Using fixed point method, we prove some new stability results for Lie $(\alpha,\beta,\gamma)$-derivations and Lie $C^{\ast}$-algebra homomorphisms on Lie $C^{\ast}$-algebras associated with the Euler-Lagrange type additive functional equation \begin{align*} \sum^{n}_{j=1}f{\bigg(-r_{j}x_{j}+\sum_{1\leq i \leq n, i\neq j}r_{i}x_{i}\bigg)}+2\sum^{n}_{i=1}r_{i}f(x_{i})=nf{\bigg(\sum^{n}_{i=1}r_{i}x_{i}\bigg)} \end{...



Publication date: 2001
